cf image
PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion
Kumar, Amar, Kriz, Anita, Havaei, Mohammad, Arbel, Tal
Developing reliable and generalizable deep learning systems for medical imaging faces significant obstacles due to spurious correlations, data imbalances, and limited text annotations in datasets. Addressing these challenges requires architectures robust to the unique complexities posed by medical imaging data. The rapid advancements in vision-language foundation models within the natural image domain prompt the question of how they can be adapted for medical imaging tasks. In this work, we present PRISM, a framework that leverages foundation models to generate high-resolution, language-guided medical image counterfactuals using Stable Diffusion. Our approach demonstrates unprecedented precision in selectively modifying spurious correlations (the medical devices) and disease features, enabling the removal and addition of specific attributes while preserving other image characteristics. Through extensive evaluation, we show how PRISM advances counterfactual generation and enables the development of more robust downstream classifiers for clinically deployable solutions. To facilitate broader adoption and research, we make our code publicly available at https://github.com/Amarkr1/PRISM.
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance
Zhang, Weiyi, Huang, Siyu, Yang, Jiancheng, Chen, Ruoyu, Ge, Zongyuan, Zheng, Yingfeng, Shi, Danli, He, Mingguang
Fundus Fluorescein Angiography (FFA) is a critical tool for assessing retinal vascular dynamics and aiding in the diagnosis of eye diseases. However, its invasive nature and lower accessibility compared to Color Fundus (CF) images pose significant challenges. Current CF-to-FFA translation methods are limited to static generation. In this work, we pioneer dynamic FFA video generation from static CF images. We introduce an autoregressive GAN for smooth, memory-saving frame-by-frame FFA synthesis. To enhance the focus on dynamic lesion changes in FFA regions, we design a knowledge mask based on clinical experience. Leveraging this mask, our approach integrates innovative knowledge mask-guided techniques, including knowledge-boosted attention, knowledge-aware discriminators, and mask-enhanced patchNCE loss, aimed at refining generation in critical areas and addressing the pixel misalignment challenge. Our method achieves the best FVD of 1503.21 and PSNR of 11.81 compared to other common video generation approaches. Human assessment by an ophthalmologist confirms its high generation quality. Notably, our knowledge mask surpasses supervised lesion segmentation masks, offering a promising non-invasive alternative to traditional FFA for research and clinical applications. The code is available at https://github.com/Michi-3000/Fundus2Video.
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > South Carolina (0.04)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
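For readers unfamiliar with the PSNR figure quoted in the Fundus2Video abstract, the following is a minimal sketch of the standard peak signal-to-noise ratio formula, 10 * log10(MAX^2 / MSE), applied to two flattened frames. The function name `psnr`, the list-of-pixels input, and the default peak value of 255 are illustrative assumptions; the abstract does not state how the authors computed or averaged the metric.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    `max_value` is the assumed peak intensity (255 for 8-bit images); the
    abstract does not specify the value range used by the authors.
    """
    if not reference or len(reference) != len(generated):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical two-pixel frames, used only to exercise the function.
example_db = psnr([100, 100], [110, 90])
```

Higher values indicate that the generated frame is closer to the reference. For video, a common choice is to average the per-frame values, though that too is an assumption here.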
Identifying Spurious Correlations using Counterfactual Alignment
Cohen, Joseph Paul, Blankemeier, Louis, Chaudhari, Akshay
Models driven by spurious correlations often yield poor generalization performance. We propose the counterfactual alignment method to detect and explore spurious correlations of black-box classifiers. Counterfactual images generated with respect to one classifier can be input into other classifiers to see if they also induce changes in the outputs of these classifiers. The relationship between these responses can be quantified and used to identify specific instances where a spurious correlation exists as well as compute aggregate statistics over a dataset. Our work demonstrates the ability to detect spurious correlations in face attribute classifiers. This is validated by observing intuitive trends in a face attribute classifier as well as fabricating spurious correlations and detecting their presence, both visually and quantitatively. Further, utilizing the CF alignment method, we demonstrate that we can rectify spurious correlations identified in classifiers.
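The idea of comparing classifier responses to the same counterfactual images can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each classifier's scores on the original and counterfactual images are already available as plain lists, and it uses Pearson correlation of the per-image output changes as one plausible way to quantify the relationship. The function names (`output_changes`, `pearson`, `cf_alignment`) and the example scores are hypothetical.

```python
import math


def output_changes(original_scores, counterfactual_scores):
    """Per-image change in a classifier's output when the counterfactual is used."""
    return [cf - orig for orig, cf in zip(original_scores, counterfactual_scores)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a classifier that never changes carries no alignment signal
    return cov / math.sqrt(var_x * var_y)


def cf_alignment(base_orig, base_cf, other_orig, other_cf):
    """Correlate the base classifier's output changes with another classifier's.

    The counterfactuals are assumed to have been generated with respect to
    the base classifier and then fed to the other classifier.
    """
    return pearson(
        output_changes(base_orig, base_cf),
        output_changes(other_orig, other_cf),
    )


# Hypothetical scores for four images.
base_orig, base_cf = [0.9, 0.8, 0.7, 0.6], [0.1, 0.3, 0.2, 0.4]
tracking_orig, tracking_cf = [0.8, 0.7, 0.6, 0.5], [0.0, 0.2, 0.1, 0.3]
unmoved_orig = unmoved_cf = [0.5, 0.5, 0.5, 0.5]

aligned_score = cf_alignment(base_orig, base_cf, tracking_orig, tracking_cf)
unaligned_score = cf_alignment(base_orig, base_cf, unmoved_orig, unmoved_cf)
```

A score near 1 or -1 suggests the second classifier responds to the same image changes as the base classifier, which may point to a shared or spurious feature. A score near 0 suggests the two are unrelated on these images.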